The Growing Hierarchical Self-Organizing Map (GHSOM) for analysing multi-dimensional river habitat datasets
Abstract
River field surveys are carried out to describe the biological habitats and main geomorphic features of a river stretch. They can be extensive, expensive and time-consuming campaigns that sample a large number of features. These features belong to a complex river ecosystem characterized by many different processes at various scales, from simple to highly non-linear. Researchers need sophisticated techniques to manage such multi-dimensional datasets, which are so arduously obtained. In this paper we apply an algorithm, the Growing Hierarchical Self-Organizing Map (GHSOM), to analyse the River Habitat Survey (RHS) database in the UK. The RHS is a system for assessing the character and quality of rivers based on their physical structure. More than one hundred variables were sampled at each survey site, at more than 10 000 sites over the past ten years. The GHSOM is a variant of the Self-Organizing Map (SOM) algorithm that is particularly useful for exploratory data mining of multi-dimensional datasets. It produces an intuitive representation of hierarchical relations in the data. More than seventy ordinal variables, each representing the occurrence of a feature in the river stretch, are analysed for 7000 sites, and hierarchical patterns are obtained. The algorithm produces a hierarchical structure of four layers of clusters: from a general classification of stream habitats composed of 6 clusters to a very fine one with a few hundred clusters. This complex hierarchical structure is first interpreted by labelling the clusters with their most frequent features, i.e. using just the input variables. We are interested in assessing how closely a habitat type (as defined by a cluster) corresponds to a river type (as defined using different river classifications). A specific index able to assess the supposed link between river types and stream habitats is developed.
It quantifies the distribution of different river types across the hierarchical clustering structure: it calculates how widely a stream habitat is shared among different typologies or, in other words, how well it represents a specific river type. Two different river classifications are analysed. One was developed by the UK Environment Agency (EA) to disseminate the results of the first national RHS. The other was proposed by the UK as a river typology classification to meet the requirements of the Water Framework Directive (WFD). The latter shows a weak link between river types and stream habitats. By contrast, the river classification developed by the EA, which is based on natural drivers such as geology, slope, mean annual discharge and altitude, shows a much stronger link. These results offer interesting insights into the key role of natural driving forces in the geomorphic processes responsible for stream habitat formation, which deserve further analysis. The hierarchical structure furthermore allows this link to be assessed for different types of habitat classification, from broad to detailed ones. Analysing whether much finer stream habitat classifications, i.e. those composed of a high number of clusters, give a better link with river types makes it possible to identify which aspects of stream habitats, e.g. very general (different eco-regions) or very detailed (various management of riparian vegetation) ones, are more sensitive for identifying river types, and to investigate the possible reasons for this. The framework presented is therefore suitable for analysing the influence of stream habitats on a full range of environmental objectives. In the present work we analyse river classifications, but the same approach could be applied to other components of the fluvial ecosystem such as fish or invertebrate communities. This approach has the potential to improve our understanding of the fluvial ecosystem and to bring management benefits.
It makes it possible to develop optimum habitat classifications that meet management requirements while minimizing the number of habitat classes identified. This output could then produce management benefits by addressing the characterization of habitat status and the planning of future monitoring campaigns, which could be optimised in relation to the classification adopted.
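The abstract does not give the formula of the index that links river types to stream habitats. As a rough illustration only, the sketch below uses a stand-in measure, one minus the normalized Shannon entropy of river types within each cluster, to show how such a link could be quantified at any one layer of the hierarchy. The function name `cluster_type_specificity`, its arguments and the entropy-based definition are all assumptions made for this example, not the index developed in the paper.

```python
from collections import Counter, defaultdict
from math import log


def cluster_type_specificity(cluster_ids, river_types):
    """Hypothetical stand-in index (NOT the paper's formula).

    For each cluster, returns 1 - H / log(K), where H is the Shannon
    entropy of the river types found in that cluster and K is the number
    of distinct river types in the whole dataset. A value of 1 means the
    cluster holds a single river type; 0 means the types are evenly mixed.
    Also returns the mean over clusters, weighted by cluster size.
    """
    if len(cluster_ids) != len(river_types):
        raise ValueError("cluster_ids and river_types must have equal length")
    if not cluster_ids:
        raise ValueError("at least one site is required")

    n_types = len(set(river_types))
    # With a single river type the entropy is always 0, so any positive
    # normalizer gives a specificity of 1.
    max_entropy = log(n_types) if n_types > 1 else 1.0

    # Count river types per cluster.
    counts_by_cluster = defaultdict(Counter)
    for cluster, river_type in zip(cluster_ids, river_types):
        counts_by_cluster[cluster][river_type] += 1

    per_cluster = {}
    weighted_sum = 0.0
    for cluster, counts in counts_by_cluster.items():
        size = sum(counts.values())
        entropy = -sum((k / size) * log(k / size) for k in counts.values())
        per_cluster[cluster] = 1.0 - entropy / max_entropy
        weighted_sum += per_cluster[cluster] * size

    return per_cluster, weighted_sum / len(cluster_ids)


if __name__ == "__main__":
    # Toy data: six sites, cluster labels from one layer, two river types.
    layer_clusters = [0, 0, 0, 1, 1, 1]
    types = ["upland", "upland", "lowland", "lowland", "lowland", "lowland"]
    per_cluster, overall = cluster_type_specificity(layer_clusters, types)
    print(per_cluster, overall)
```

Running the same function on the cluster labels of each layer, once per river classification, would give one way to compare how the strength of the link changes from the broad classification to the finer ones.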
Similar resources
An Intrusion Detection Method Based on Improved Growing Hierarchical Self-Organizing Map
Growing hierarchical self-organizing map (GHSOM), as a kind of topology map, is an effective method to process large scale data. It not only enjoys the advantages of self-organizing map (SOM), but also owns its special multi-layer hierarchical structure which is easy to reveal the hierarchical structure behind the input data by using GHSOM. Though GHSOM has made great progress on the improvemen...
Growing hierarchical self-organizing map method using category utility
In order to automatically obtain hierarchical knowledge representation from a certain data, an unsupervised learning method has been developed that overcomes two problems of the growing hierarchical self-organizing map (GHSOM) method, which uses the quantization error, the deviation of the input data, as evaluation measure of the growing maps: proper control of the growth process of each map is...
The growing hierarchical self-organizing map: exploratory analysis of high-dimensional data
The self-organizing map (SOM) is a very popular unsupervised neural-network model for the analysis of high-dimensional input data as in data mining applications. However, at least two limitations have to be noted, which are related to the static architecture of this model as well as to the limited capabilities for the representation of hierarchical relations of the data. With our novel growing ...
New Variant of the Growing Hierarchical Self Organizing Map GH-DeSieno-SOM for Phoneme Recognition
The Growing Hierarchical Self-Organizing Map (GHSOM) is a network of neurons whose architecture combines two principal extensions of SOM model, the dynamic growth and the tree structure. This paper presents a variant of the growing GHSOM. The proposed variant is like the basic GHSOM. However, it is characterized for each neuron of each map level by a conscious term which takes into consideratio...
Automating Ontology Generation for Information Systems Research Using GHSOM
Building ontology for a specific field of research is a very tedious task; yet, very important. Ontologies can help in defining the boundaries of a discipline and identifying new emerging streams of research. Automating this process reduces, if not eliminates, the overhead associated with manual ontology building methods and gives a big jump to continue refining and improving the generated onto...